Extending the Coverage of DBpedia Properties using Distant Supervision over Wikipedia
نویسندگان
چکیده
DBpedia is a Semantic Web project aiming to extract structured data from Wikipedia articles. Due to the increasing number of resources linked to it, DBpedia plays a central role in the Linked Open Data community. Currently, the information contained in DBpedia is mainly collected from Wikipedia infoboxes, a set of subject-attribute-value triples that represents a summary of the Wikipedia page. These infoboxes are manually compiled by the Wikipedia contributors, and in more than 50% of the Wikipedia articles the infobox is missing. In this article, we use the distant supervision paradigm to extract the missing information directly from the Wikipedia article, using a Relation Extraction tool trained on the information already present in DBpedia. We evaluate our system on a data set consisting of seven DBpedia properties, demonstrating the suitability of the approach in extending the DBpedia coverage.
منابع مشابه
Extending DBpedia with List Structures in Wikipedia Articles
Ontologies are the basis of the Semantic Web. Owing to the cost of their construction and maintenance, however, there is much interest in automating their construction. Wikipedia is considered a promising source of knowledge because of its own characteristics. DBpedia extracts a large amount of ontological information from Wikipedia. However, DBpedia focuses exclusively on infoboxes (i.e., tabl...
متن کاملDistant Supervision for Relation Extraction Using Ontology Class Hierarchy-Based Features
Relation extraction is a key step in the problem of structuring natural language text. This paper demonstrates a multi-class classifier for relation extraction, constructed using the distant supervision approach, along with resources of the Semantic Web. In particular, the classifier uses a feature based on the class hierarchy of an ontology that, in conjunction with basic lexical features, imp...
متن کاملQuerying Multilingual DBpedia with QAKiS
We present an extension of QAKiS, a system for open domain Question Answering over linked data, that allows to query DBpedia multilingual chapters. Such chapters can contain different information with respect to the English version, e.g. they provide more specificity on certain topics, or fill information gaps. QAKiS exploits the alignment between properties carried out by DBpedia contributors ...
متن کاملExtending DBpedia with Wikipedia List Pages
Thanks to its wide coverage and general-purpose ontology, DBpedia is a prominent dataset in the Linked Open Data cloud. DBpedia’s content is harvested from Wikipedia’s infoboxes, based on manually created mappings. In this paper, we explore the use of a promising source of knowledge for extending DBpedia, i.e., Wikipedia’s list pages. We discuss how a combination of frequent pattern mining and ...
متن کاملFinding People's Professions and Nationalities Using Distant Supervision - The FMI@SU "goosefoot" team at the WSDM Cup 2017 Triple Scoring Task
We describe the system that our FMI@SU student’s team built for participating in the Triple Scoring task at the WSDM Cup 2017. Given a triple from a “type-like” relation, profession or nationality, the goal is to produce a score, on a scale from 0 to 7, that measures the relevance of the statement expressed by the triple: e.g., how well does the profession of an Actor fit for Quentin Tarantino?...
متن کامل